Model Class Reliance
Review for NeurIPS paper: Model Class Reliance for Random Forests
Weaknesses: The main concern I have with the paper is the argument that the estimator does in fact converge to MCR+ and MCR- for random forests. Section 4.1 argues that, as the number of trees goes to infinity, each tree will be replaced with one from its Rashomon set that is maximally dependent on X1 (when running the MCR procedure). With finite samples and a finite number of trees, there are reasons to doubt that this method yields a consistent estimate of MCR for the random forest as a whole. The favorable generalization properties of random forests are known to derive from the diversity of trees in the ensemble: a well-known result of Breiman is that the generalization error decreases as the correlation between the trees' residuals decreases. So while the predictions of a tree and its surrogate may be identical on a given dataset, replacing each tree with a surrogate that is maximally dependent on X1 may increase the correlation between trees, and thereby worsen the expected generalization error of the forest as a whole.
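The reviewer's correlation argument can be made concrete with the standard variance decomposition for an average of equally correlated predictors. The following is an illustrative sketch with hypothetical numbers, not code or results from the paper under review:

```python
# Illustrative sketch (hypothetical values, not from the paper): the variance
# of the average of B predictors, each with variance sigma^2 and pairwise
# correlation rho, is
#     rho * sigma^2 + (1 - rho) * sigma^2 / B.
# As rho grows, averaging buys less variance reduction, which is why raising
# inter-tree correlation (e.g. by making every tree rely on X1) is a concern.

def ensemble_variance(rho: float, sigma2: float = 1.0, B: int = 100) -> float:
    """Variance of the mean of B predictors with pairwise correlation rho."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / B

# Diverse trees vs. highly correlated surrogate trees:
low = ensemble_variance(rho=0.1)   # diverse ensemble
high = ensemble_variance(rho=0.9)  # surrogates all reliant on one variable
print(f"rho=0.1 -> var {low:.3f}, rho=0.9 -> var {high:.3f}")
```

With rho near 1 the ensemble behaves like a single tree, losing most of the benefit of averaging, which is the core of the reviewer's consistency worry.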
Meta-review for NeurIPS paper: Model Class Reliance for Random Forests
This is a relevant and timely paper that has been reviewed by four knowledgeable referees, who also thoroughly considered the authors' response to their initial reviews. Three of these reviewers recommend acceptance, providing detailed suggestions on how to improve the work before its final submission; the fourth, R3, recommends rejection, and upheld this dissenting opinion after discussion with the other referees. In my opinion, R3 correctly points out that if the proposed approach aims to improve runtime with an approximate algorithm, this must be convincingly demonstrated in experiments against straightforward alternatives (such as retraining-based methods). That was done neither in the original submission nor in the rebuttal.
Model Class Reliance for Random Forests
Variable Importance (VI) has traditionally been cast as the process of estimating each variable's contribution to a predictive model's overall performance. Recent research has sought to address concerns with such single-model estimates via the analysis of Rashomon sets: sets of alternative model instances that exhibit predictive performance equivalent to some reference model, but which take different functional forms. Measures such as Model Class Reliance (MCR), computed over Rashomon sets, have been proposed to ascertain how much a variable must be relied on to make robust predictions, or whether alternatives exist. If the MCR range is tight, we have no choice but to use the variable; if the range is wide, there exist competing, perhaps fairer, models that provide alternative explanations of the phenomena being examined. Applications are wide, including the construction of 'fairer' models in areas such as recidivism, health analytics and ethical marketing.
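The idea of a wide MCR range is easy to see in a toy setting with two highly correlated features. The sketch below is purely illustrative (hypothetical data and a ratio-style reliance measure in the spirit of permutation importance, not the authors' algorithm): many models achieve near-identical loss, yet their reliance on x1 spans a wide interval.

```python
# Illustrative sketch (hypothetical, not the authors' method): with two
# nearly duplicate features, models f_w(x) = w*x1 + (2-w)*x2 all fit y
# almost equally well (an epsilon-Rashomon set), but their reliance on x1
# differs greatly, giving a wide MCR-style range for x1.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # x2 nearly duplicates x1
y = x1 + x2 + 0.1 * rng.normal(size=n)

def mse(pred):
    return float(np.mean((y - pred) ** 2))

def reliance(w):
    """Reliance of f_w on x1: loss after permuting x1, over original loss."""
    base = mse(w * x1 + (2 - w) * x2)
    x1_perm = rng.permutation(x1)
    return mse(w * x1_perm + (2 - w) * x2) / base

ws = [0.0, 0.5, 1.0, 1.5, 2.0]
losses = [mse(w * x1 + (2 - w) * x2) for w in ws]   # all nearly equal
rels = [reliance(w) for w in ws]                    # vary enormously
print("losses:", [round(l, 4) for l in losses])
print("MCR-style range for x1:", (round(min(rels), 2), round(max(rels), 2)))
```

A reliance near 1 means the model barely uses x1 (a competing model can substitute x2), while a large reliance means x1 is essential to that model; the spread between the two is the toy analogue of the MCR range discussed in the abstract.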